{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Retrievers\n",
    "A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store. A retriever does not need to be able to store documents, only to return (or retrieve) them. Vector stores can be used as the backbone of a retriever, but there are other types of retrievers as well.\n",
    "\n",
    "检索器是一个接口，它在给定非结构化查询的情况下返回文档。它比向量存储更通用。检索器不需要能够存储文档，只需返回（或检索）它们。向量存储可以用作检索器的主干，但也有其他类型的检索器。\n",
    "\n",
    "Retrievers accept a string query as input and return a list of Document's as output.\n",
    "\n",
    "Retrievers 接受一个字符串 query 作为输入，并返回一个 Document 的列表作为输出。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'辽宁的省会是沈阳。'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain_openai import ChatOpenAI\n",
    "from langchain_core.prompts import ChatPromptTemplate\n",
    "from langchain_core.output_parsers import StrOutputParser\n",
    "from langchain_core.runnables import RunnablePassthrough\n",
    "from langchain_community.document_loaders import TextLoader\n",
    "from langchain_community.vectorstores import FAISS\n",
    "from langchain_openai import OpenAIEmbeddings\n",
    "from langchain_text_splitters import CharacterTextSplitter\n",
    "\n",
    "\n",
    "loader = TextLoader('../data/city.txt')\n",
    "documents = loader.load()\n",
    "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
    "texts = text_splitter.split_documents(documents)\n",
    "embeddings = OpenAIEmbeddings()\n",
    "db = FAISS.from_documents(texts, embeddings)\n",
    "retriever = db.as_retriever()\n",
    "\n",
    "template = \"\"\"Answer the question based only on the following context:\n",
    "\n",
    "{context}\n",
    "\n",
    "Question: {question}\n",
    "\"\"\"\n",
    "prompt = ChatPromptTemplate.from_template(template)\n",
    "model = ChatOpenAI()\n",
    "\n",
    "\n",
    "def format_docs(docs):\n",
    "    return \"\\n\\n\".join([d.page_content for d in docs])\n",
    "\n",
    "\n",
    "chain = (\n",
    "    {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
    "    | prompt\n",
    "    | model\n",
    "    | StrOutputParser()\n",
    ")\n",
    "\n",
    "chain.invoke(\"辽宁的省会是哪里?\")\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "langchain0_1",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
